We've seen how NumPy can perform simple operations on arrays by working on the elements inside, for example by multiplying every element by 100. It also provides ways to combine entire arrays, even if their shapes don't exactly match.
As ever, we need to start by importing numpy:
import numpy as np
If you have two arrays which exactly match each other in shape then you can directly combine them. So, if we have one (2, 3) array, grid_a:
grid_a = np.array([[1, 2, 3], [4, 5, 6]])
grid_a
And another, grid_b, which is also (2, 3):
grid_b = np.array([[9, 8, 7], [6, 5, 4]])
grid_b
Then we can do any numerical or logical operation between them, just as we did with an array and a single number. For example, a multiplication will multiply the values element-by-element ($1\times9=9$, $2\times8=16$, $3\times7=21$ etc.):
grid_a * grid_b
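The same element-by-element pairing applies to other operators too. As a small sketch (the two grids are re-created so that it runs on its own, and the names total and a_is_larger are only illustrative):

```python
import numpy as np

grid_a = np.array([[1, 2, 3], [4, 5, 6]])
grid_b = np.array([[9, 8, 7], [6, 5, 4]])

# Addition pairs up the elements that sit in the same position
total = grid_a + grid_b

# A comparison gives back a boolean array of the same shape
a_is_larger = grid_a > grid_b

print(total)
print(a_is_larger)
```

Every entry of total is 10 here, and only the last entry of a_is_larger is True, since 6 is the only value in grid_a that is larger than its partner in grid_b.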
If we define another array with a different shape, for example (3, 2):
grid_c = np.array([[5, 4], [6, 3], [7, 2]])
grid_c
Then the multiplication will not work as the shapes don't exactly match:
grid_a * grid_c
In a way, this makes sense. Exactly what did we expect to be the result of grid_a * grid_c?
We see the term "broadcast" in that error message. We'll get back to that in a minute.
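If you want to look at that error without it stopping your code, one option is to catch it. This is only a sketch: it relies on NumPy raising a ValueError for mismatched shapes, and the names failed and message are made up for the example.

```python
import numpy as np

grid_a = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)
grid_c = np.array([[5, 4], [6, 3], [7, 2]])  # shape (3, 2)

try:
    grid_a * grid_c
    failed = False
    message = ""
except ValueError as error:
    # NumPy refuses to combine arrays whose shapes it cannot match up
    failed = True
    message = str(error)

print(failed)
print(message)
```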
This doesn't seem very useful, if we can only ever work with exactly matching array shapes. Luckily, there are many situations where NumPy is able to combine arrays, even if they don't match. For example, we can take an array:
a = np.array([6.0, 2.1, 8.2])
a.shape
and multiply it with grid_a:
grid_a * a
This works because it's combining an array grid_a with shape (2, 3) with another array a with shape (3,). It's able to match them together by stretching a so that its dimensions match grid_a:
1 | 2 | 3 |
4 | 5 | 6 |
×
6.0 | 2.1 | 8.2 |

↓

1 | 2 | 3 |
4 | 5 | 6 |
×
6.0 | 2.1 | 8.2 |
6.0 | 2.1 | 8.2 |

=

6.0 | 4.2 | 24.6 |
24.0 | 10.5 | 49.2 |
Note here we have switched to the row-format of one-dimensional arrays as it makes it easier to understand how they are combined.
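To convince yourself that the picture above matches what NumPy computes, you can do the stretching by hand. The sketch below uses np.broadcast_to, which returns a read-only stretched view of an array; the name a_stretched is just for illustration.

```python
import numpy as np

grid_a = np.array([[1, 2, 3], [4, 5, 6]])
a = np.array([6.0, 2.1, 8.2])

# Stretch a to the shape of grid_a (a read-only view, nothing is copied)
a_stretched = np.broadcast_to(a, grid_a.shape)
print(a_stretched)

# Multiplying by the stretched version gives the same answer as grid_a * a
same = np.array_equal(grid_a * a_stretched, grid_a * a)
print(same)
```

In everyday code you would simply write grid_a * a; the explicit version is only here to make the stretching visible.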
This stretching operation is known as "broadcasting" in NumPy. There is a set of rules governing which array shapes can be combined with which others, and these are detailed in the official broadcasting documentation.
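In short, the rules compare the two shapes starting from the last axis and working backwards, and two sizes are compatible if they are equal or if one of them is 1. If you'd like to check a pair of shapes without building any arrays, a possible sketch is to use np.broadcast_shapes (this assumes NumPy 1.20 or newer; the name compatible is made up for the example):

```python
import numpy as np

# Shapes are compared from the last axis backwards; two sizes are
# compatible if they are equal or if one of them is 1.
shape_1 = np.broadcast_shapes((2, 3), (3,))
shape_2 = np.broadcast_shapes((2, 3), (2, 1))
print(shape_1)
print(shape_2)

# A (2,) cannot be matched against the last axis of a (2, 3)
try:
    np.broadcast_shapes((2, 3), (2,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```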
So, if we try to combine grid_a with an array of shape (2,):
b = np.array([10, 10])
grid_a * b
Then it fails since it was unable to broadcast a (2,) to a (2, 3).
There are ways to manipulate the arrays to make this work, which are all covered in the documentation linked above. You might hope that NumPy would see that it's combining a (2, 3) with a (2,) and stretch in the other dimension, but the rules are designed to be simple and predictable so that you can always reason about them, rather than being too clever and magic.
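One such manipulation, shown here only as a sketch, is to give the smaller array an extra axis of length 1 with np.newaxis so that it becomes a column of shape (2, 1). Note that this example uses the values [10, 20] rather than [10, 10] so that the effect on each row is visible, and that the names b_column and result are made up for the example.

```python
import numpy as np

grid_a = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
b = np.array([10, 20])                     # shape (2,)

# Add a trailing axis of length 1, turning b into a column
b_column = b[:, np.newaxis]                # shape (2, 1)

# The column is now stretched sideways to match grid_a
result = grid_a * b_column
print(result)
```

The first row of grid_a is multiplied by 10 and the second row by 20.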
As a final example, when you do a * 5, NumPy is effectively doing this stretching automatically behind the scenes:
6.0 | 2.1 | 8.2 |
×
5

↓

6.0 | 2.1 | 8.2 |
×
5 | 5 | 5 |

=

30.0 | 10.5 | 41.0 |
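You can picture this by building the stretched array of fives yourself. This is only a way of checking the idea, as NumPy does not need to create such an array in memory; the name fives is made up for the example.

```python
import numpy as np

a = np.array([6.0, 2.1, 8.2])

# An array of fives with the same shape as a
fives = np.full(a.shape, 5)
print(fives)

# Multiplying by the single number gives the same result
same = np.array_equal(a * 5, a * fives)
print(same)
```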
Once again, grab the "temperature" array. Remember, this is in units of Kelvin and is three-dimensional with axes of altitude, latitude and longitude. The altitude axis is layered such that the 0th layer is ground-level and each layer beyond that increases in altitude.
with np.load("weather_data.npz") as weather:
    temperature = weather["temperature"]
    # masks
    uk_mask = weather["uk"]
    irl_mask = weather["ireland"]
temperature
Combine the temperature data with the mask to extract only those values from within the UK.